Explorations on Positionwise Flag Diacritics in Finite-State Morphology
Abstract
A novel technique for adding positionwise flags to one-level finite-state lexicons is presented. The proposed flags are a kind of morphophonemic marker, and they provide a flexible method for describing morphophonological processes with a formalism that is tightly coupled with lexical entries and rule-like regular expressions. The formalism is inspired by the techniques used in two-level rule compilation, and in practice it compiles all the rules in parallel, but in an efficient way. The technique handles morphophonological processes without a separate morphophonemic representation. The occurrences of the allomorphophonemes in latent phonological strings are tracked through a dynamic data structure into which the most prominent (i.e. the best-ranked) flags are collected. The technique is expected to be advantageous for describing the morphology of Bantu languages and dialects.
Related resources
Constraining Separated Morphotactic Dependencies In Finite-State Grammars
[Morphology, Morphotactics, Finite State, Separated Dependencies] This paper examines dependencies between separated (non-adjacent) morphemes in natural-language words and a variety of ways to constrain them in finite-state morphology. Methods include running separate constraining transducers at runtime, composing in constraints at compile time, feature unification, and the use of FLAG DIACRITIC...
Automated Lossless Hyper-Minimization for Morphological Analyzers
This paper presents a fully automated lossless hyper-minimization method for finite-state morphological analyzers in the Xerox lexc formalism. The method utilizes flag diacritics to preserve the structure of the original lexc description in the finite-state analyzer, which results in reduced size of the analyzer. We compare our method against an earlier solution by Drobac et al. (2014) which require...
Morphology and two-level rules and negative rule features
Two-level phonology, as currently practiced, has two severe limitations. One is that phonological generalizations are generally expressed in terms of transition tables of finite-state automata, and these tables are cumbersome to develop and refine. The other is that lexical idiosyncrasy is encoded by introducing arbitrary diacritics into the spelling of a morpheme. This paper explains how phono...
Heuristic Hyper-minimization of Finite State Lexicons
Flag diacritics, which are special multi-character symbols executed at runtime, enable optimising finite-state networks by combining identical sub-graphs of its transition graph. Traditionally, the feature has required linguists to devise the optimisations to the graph by hand alongside the morphological description. In this paper, we present a novel method for discovering flag positions in mor...
Finite-State Morphological Analysis for Marathi
This paper describes the development of free/open-source morphological descriptions for Marathi, an Indo-Aryan language spoken in the state of Maharashtra in India. We describe the conversion and usage of an existing Latin-based lexicon for our Devanagari-based analyser, taking into account the distinction between full vowels and diacritics, that is not adequately captured by the Latin. Marathi...